15 November 2022
# A tsibble: 168 x 2 [1M]
Month Turnover
<mth> <dbl>
1 2006 Jan 1914.
2 2006 Feb 1750
3 2006 Mar 1984.
4 2006 Apr 1967.
5 2006 May 2005.
6 2006 Jun 1944.
7 2006 Jul 2019.
8 2006 Aug 2043.
9 2006 Sep 2039.
10 2006 Oct 2113.
# … with 158 more rows
auscafe |>
filter(year(Month) <= 2017) |>
model(
ets = ETS(Turnover),
arima = ARIMA(Turnover ~ pdq(d=1) + PDQ(D=1)),
snaive = SNAIVE(Turnover)
) |>
mutate(ensemble = (ets + arima + snaive)/3)# A mable: 1 x 4
ets arima snaive ensemble
<model> <model> <model> <model>
1 <ETS(M,A,M)> <ARIMA(0,1,1)(0,1,1)[12]> <SNAIVE> <COMBINATION>
auscafe |>
filter(year(Month) <= 2017) |>
model(
ets = ETS(Turnover),
arima = ARIMA(Turnover ~ pdq(d=1) + PDQ(D=1)),
snaive = SNAIVE(Turnover)
) |>
mutate(ensemble = (ets + arima + snaive)/3) |>
forecast(h = "2 years")# A fable: 96 x 4 [1M]
# Key: .model [4]
.model Month Turnover .mean
<chr> <mth> <dist> <dbl>
1 ets 2018 Jan N(3753, 4335) 3753.
2 ets 2018 Feb N(3434, 4937) 3434.
3 ets 2018 Mar N(3801, 7735) 3801.
4 ets 2018 Apr N(3736, 9189) 3736.
5 ets 2018 May N(3782, 11269) 3782.
6 ets 2018 Jun N(3666, 12416) 3666.
7 ets 2018 Jul N(3876, 16022) 3876.
8 ets 2018 Aug N(3923, 18713) 3923.
9 ets 2018 Sep N(3878, 20630) 3878.
10 ets 2018 Oct N(4010, 24681) 4010.
# … with 86 more rows
auscafe |>
filter(year(Month) <= 2017) |>
model(
ets = ETS(Turnover),
arima = ARIMA(Turnover ~ pdq(d=1) + PDQ(D=1)),
snaive = SNAIVE(Turnover)
) |>
mutate(ensemble = (ets + arima + snaive)/3) |>
forecast(h = "2 years") |>
accuracy(
data = auscafe,
measures = list(rmse=RMSE)
) |>
arrange(rmse)# A tibble: 4 × 3
.model .type rmse
<chr> <chr> <dbl>
1 ensemble Test 28.7
2 ets Test 88.2
3 arima Test 111.
4 snaive Test 174.
According to the democratic principle of “one vote one value”, the middlemost estimate expresses the vox populi.
“The combined forecasting methods seem to me to be non-starters . . . Is a combined method not in danger of falling between two stools?” Maurice Priestley
The first genuine forecasting competition
The first genuine forecasting competition
Conclusions
The first genuine forecasting competition
Consequences
Researchers started to:
Winning methods
Suppose we have N different forecasting methods.
\hat{\bm{y}}_{T+h|T} = N-vector of h-step-ahead forecasts using information up to time T.
\bm{\Sigma}_{T+h|T}= N\times N covariance matrix of the h-step forecast errors.
\bm{1}= unit vector.
Simple combination: \tilde{y}_{T+h|T} = N^{-1}\bm{1}'\hat{\bm{y}}_{T+h|T}
Linear combination: \tilde{y}_{T+h|T} = \bm{w}'_{T+h|T}\hat{\bm{y}}_{T+h|T}
Optimal combination (minimizing MSE):
\bm{w}_{T+h|T} = \frac{\bm{\Sigma}_{T+h|T}^{-1}\bm{1}}{\bm{1}'\bm{\Sigma}_{T+h|T}^{-1}\bm{1}'} (Bates & Granger, 1969; Granger & Newbold, 1974)
Regress y_{t+h} against \hat{\bm{y}}_{t+h|t} for t=1,\dots,T.
Simple average combinations almost always give more accurate forecasts than other combination methods.
Feature-based FORecast Model Averaging
Ensemble forecasting: mix the forecast distributions from multiple models.
Ensemble forecasting: mix the forecast distributions from multiple models.
Let F_{i,T+h|T}(y) be the h-step forecast distribution for the ith method. Then \begin{align*} \tilde{F}_{T+h|T}(y) &= \sum_{i=1}^N w_iF_{i,T+h|T}(y) \\ \tilde{\mu} &= \sum_{i=1}^N w_i\mu_i \\ \tilde{\sigma}^2 &= \sum_{i=1}^N w_i\sigma_i^2 + \sum_{i=1}^N w_i(\mu_i - \tilde{\mu})^2 \end{align*}
More diverse forecasts may inflate the variance, but mitigates against over-confidence.
\begin{align*} f_{p,t} &= \text{quantile forecast with prob. $p$ at time $t$.}\\ y_{t} &= \text{observation at time $t$} \end{align*}
Q_{p,t} = \begin{cases} 2(1 - p) \big|y_t - f_{p,t}\big|, & \text{if $y_{t} < f_{p,t}$}\\ 2p \big|y_{t} - f_{p,t}\big|, & \text{if $y_{t} \ge f_{p,t}$} \end{cases}
NULL
# A tibble: 4 × 4
.model .type rmse crps
<chr> <chr> <dbl> <dbl>
1 arima Test 79.1 48.5
2 ensemble Test 29.7 39.6
3 ets Test 61.4 41.3
4 snaive Test 116. 70.6
# A tsibble: 1,344 x 3 [1M]
# Key: State [8]
Month State Turnover
<mth> <chr> <dbl>
1 2006 Jan ACT 21.7
2 2006 Feb ACT 21.9
3 2006 Mar ACT 24.9
4 2006 Apr ACT 24.8
5 2006 May ACT 27
6 2006 Jun ACT 24.5
7 2006 Jul ACT 24
8 2006 Aug ACT 26.1
9 2006 Sep ACT 26.2
10 2006 Oct ACT 33.7
# … with 1,334 more rows
# A tsibble: 1,248 x 3 [1M]
# Key: State [8]
Month State Turnover
<mth> <chr> <dbl>
1 2006 Jan ACT 21.7
2 2006 Feb ACT 21.9
3 2006 Mar ACT 24.9
4 2006 Apr ACT 24.8
5 2006 May ACT 27
6 2006 Jun ACT 24.5
7 2006 Jul ACT 24
8 2006 Aug ACT 26.1
9 2006 Sep ACT 26.2
10 2006 Oct ACT 33.7
# … with 1,238 more rows
cafe |>
filter(year(Month) <= 2018) |>
model(
ETS = ETS(Turnover),
ARIMA = ARIMA(Turnover ~ pdq(d=1) + PDQ(D=1)),
SNAIVE = SNAIVE(Turnover)
) |>
mutate(
COMB = (ETS+ARIMA)/2
)# A mable: 8 x 5
# Key: State [8]
State ETS ARIMA SNAIVE COMB
<chr> <model> <model> <model> <model>
1 ACT <ETS(M,Ad,M)> <ARIMA(2,1,2)(0,1,1)[12]> <SNAIVE> <COMBINATION>
2 NSW <ETS(M,Ad,M)> <ARIMA(0,1,1)(0,1,2)[12]> <SNAIVE> <COMBINATION>
3 NT <ETS(A,N,A)> <ARIMA(1,1,0)(1,1,0)[12]> <SNAIVE> <COMBINATION>
4 QLD <ETS(M,Ad,M)> <ARIMA(0,1,1)(1,1,1)[12]> <SNAIVE> <COMBINATION>
5 SA <ETS(M,Ad,M)> <ARIMA(0,1,1)(0,1,1)[12]> <SNAIVE> <COMBINATION>
6 TAS <ETS(A,N,A)> <ARIMA(0,1,1)(0,1,1)[12]> <SNAIVE> <COMBINATION>
7 VIC <ETS(M,Ad,M)> <ARIMA(0,1,2)(0,1,2)[12]> <SNAIVE> <COMBINATION>
8 WA <ETS(M,Ad,M)> <ARIMA(0,1,1)(0,1,2)[12]> <SNAIVE> <COMBINATION>
cafe |>
filter(year(Month) <= 2018) |>
model(
ETS = ETS(Turnover),
ARIMA = ARIMA(Turnover ~ pdq(d=1) + PDQ(D=1)),
SNAIVE = SNAIVE(Turnover)
) |>
mutate(
COMB = (ETS+ARIMA)/2
) |>
forecast(h = "1 year")# A fable: 384 x 5 [1M]
# Key: State, .model [32]
State .model Month Turnover .mean
<chr> <chr> <mth> <dist> <dbl>
1 ACT ETS 2019 Jan N(61, 9.6) 60.7
2 ACT ETS 2019 Feb N(64, 15) 63.8
3 ACT ETS 2019 Mar N(72, 26) 71.9
4 ACT ETS 2019 Apr N(67, 28) 67.0
5 ACT ETS 2019 May N(70, 36) 69.8
6 ACT ETS 2019 Jun N(67, 39) 67.3
7 ACT ETS 2019 Jul N(68, 46) 68.4
8 ACT ETS 2019 Aug N(70, 53) 69.7
9 ACT ETS 2019 Sep N(69, 57) 68.5
10 ACT ETS 2019 Oct N(70, 66) 70.2
# … with 374 more rows
cafe |>
filter(year(Month) <= 2018) |>
model(
ETS = ETS(Turnover),
ARIMA = ARIMA(Turnover ~ pdq(d=1) + PDQ(D=1)),
SNAIVE = SNAIVE(Turnover)
) |>
mutate(
COMB = (ETS+ARIMA)/2
) |>
forecast(h = "1 year") |>
accuracy(data = cafe,
measures = list(crps=CRPS, rmse=RMSE)
)# A tibble: 32 × 5
.model State .type crps rmse
<chr> <chr> <chr> <dbl> <dbl>
1 ARIMA ACT Test 1.64 2.23
2 ARIMA NSW Test 18.4 28.4
3 ARIMA NT Test 2.19 3.89
4 ARIMA QLD Test 15.0 24.9
5 ARIMA SA Test 4.06 6.70
6 ARIMA TAS Test 1.52 2.70
7 ARIMA VIC Test 30.4 48.6
8 ARIMA WA Test 9.06 14.8
9 COMB ACT Test 2.02 3.31
10 COMB NSW Test 17.8 14.8
# … with 22 more rows
cafe |>
filter(year(Month) <= 2018) |>
model(
ETS = ETS(Turnover),
ARIMA = ARIMA(Turnover ~ pdq(d=1) + PDQ(D=1)),
SNAIVE = SNAIVE(Turnover)
) |>
mutate(
COMB = (ETS+ARIMA)/2
) |>
forecast(h = "1 year") |>
accuracy(data = cafe,
measures = list(ss=skill_score(CRPS))
)# A tibble: 32 × 4
.model State .type ss
<chr> <chr> <chr> <dbl>
1 ARIMA ACT Test 0.465
2 ARIMA NSW Test 0.331
3 ARIMA NT Test -0.359
4 ARIMA QLD Test 0.402
5 ARIMA SA Test 0.213
6 ARIMA TAS Test 0.438
7 ARIMA VIC Test -0.813
8 ARIMA WA Test 0.117
9 COMB ACT Test 0.340
10 COMB NSW Test 0.355
# … with 22 more rows
cafe |>
filter(year(Month) <= 2018) |>
model(
ETS = ETS(Turnover),
ARIMA = ARIMA(Turnover ~ pdq(d=1) + PDQ(D=1)),
SNAIVE = SNAIVE(Turnover)
) |>
mutate(
COMB = (ETS+ARIMA)/2
) |>
forecast(h = "1 year") |>
accuracy(data = cafe,
measures = list(ss=skill_score(CRPS))
) |>
group_by(.model) |>
summarise(sspc = mean(ss) * 100)# A tibble: 4 × 2
.model sspc
<chr> <dbl>
1 ARIMA 9.91
2 COMB 14.6
3 ETS 6.23
4 SNAIVE 0
Skill score is relative to seasonal naive forecasts